# ViT fine-tuning

Pokemon Classifier Gen9 1025
Apache-2.0
This is a fine-tuned ViT (Vision Transformer) model specifically designed for Pokémon image classification. The model has been trained to classify Pokémon images up to the ninth generation (1025 species).
Image Classification Transformers English
skshmjn
22.91k
1
UL Exterior Classification
Apache-2.0
An image classification model fine-tuned from Google's ViT-base-patch16-224, achieving 68.97% accuracy on the evaluation set.
Image Classification Transformers
sharmajai901
319
1
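The base checkpoint named in the entry above, ViT-base-patch16-224, is conventionally read as a patch size of 16 pixels and an input resolution of 224 pixels. The following is a minimal sketch, assuming that reading and the standard ViT layout with one extra classification token, of the token sequence length this implies. The function name `vit_sequence_length` is hypothetical and is not part of any library.

```python
def vit_sequence_length(image_size: int, patch_size: int, extra_tokens: int = 1) -> int:
    """Return the number of tokens a ViT encoder sees for a square image.

    Assumes non-overlapping square patches and `extra_tokens` prepended
    tokens (one [CLS] token in the standard ViT layout).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + extra_tokens


# 224 / 16 = 14 patches per side, 14 * 14 = 196 patches, plus one [CLS] token.
print(vit_sequence_length(224, 16))  # 197
```

The same arithmetic applies to the other patch16-224 checkpoints in this list, provided they keep the default input resolution.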
Nsfw Image Detector
Apache-2.0
An NSFW (Not Safe For Work) content detection model fine-tuned from Google's Vision Transformer, capable of identifying 5 types of image content.
Image Classification Transformers
LukeJacob2023
68.26k
17
Multilabel V3
Apache-2.0
A multi-label classification model fine-tuned from google/vit-base-patch16-224-in21k, with an accuracy of 73.7%.
Image Classification Transformers
Madronus
19
0
Vit Base Patch16 224 Finetuned Eurosat
Apache-2.0
An image classification model based on Google's ViT base architecture, fine-tuned on the EuroSAT remote sensing image dataset.
Image Classification Transformers
sabhashanki
18
0
Vit Base Patch16 224 In21k Male Or Female Eyes
Apache-2.0
This is a binary classification model based on the ViT architecture, designed to distinguish between male and female eye images.
Image Classification Transformers English
DunnBC22
37
1
Spoofing Vit 16 224
Apache-2.0
An image anti-spoofing detection model based on the ViT architecture, achieving 70.88% accuracy after fine-tuning on an unknown dataset.
Image Classification Transformers
venuv62
59
0
Mnist Digit Classification 2022 09 04
Apache-2.0
This is an MNIST handwritten digit classification model based on the Vision Transformer (ViT) architecture, achieving 99.23% accuracy after fine-tuning on the MNIST dataset.
Image Classification Transformers
farleyknight
740
0
Vit Base Roman Numeral
Apache-2.0
A ViT-based image classification model for Roman numerals, fine-tuned on the farleyknight/roman_numerals dataset with an accuracy of 83.09%.
Image Classification Transformers
farleyknight
13
0
Tiny Random Vit Finetuned Eurosat
This is a Vision Transformer (ViT) model based on the tiny-random-vit architecture and fine-tuned for image classification, achieving 66.47% accuracy on the evaluation set.
Image Classification Transformers
keithanpai
16
0